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Abstract 



The distribution of arrival directions of ultra-high energy cosmic rays may yield 
clues to their mysterious origin. We introduce a method of invariant statistics to ana- 
, lyze cosmic ray data which eliminates coordinate-dependent artifacts. When combined 

' with maximum likelihood analysis, the method is capable of quantifying deviations of 

the distribution from isotropy with high reliability. We test our method against pub- 
lished AGASA events with energies above 4x 10^^ eV. Angular cuts from observational 
limitations are taken into account. A model based on the Fisher distribution reveals 
' the rotation of the Earth with the axis n along the direction (5'* 53.36"^, 85.78°) 

in {RA, DEC) coordinates, which is within 5° of the equatorial north pole. Global 



Q ' anisotropy of the data, if any, hinges on finer understanding of detector acceptance 



than what is available from the published literature. 



Introduction 



^ . A puzzle has existed for more than 30 years regarding cosmic ray events with energies 

^ I exceeding 4 x 10^^ eV, a value in the range of the Greisen, Zatsepin and Kuzmin (GZK) 

bound 1^ [21 • The nature of the primary particles causing these events is controversial (see 
e.g. Refs. [SI 15 for recent review of the field). If the primaries are protons from cosmologi- 
cal distances, then their sources should be isotropically distributed. The scrambling effects 
of intervening magnetic fields are difficult to assess, but on very basic physics the magni- 
tudes of the fields (or associated correlation lengths) would have to be exceedingly small 
to avoid isotropization. Diffuse propagation of cosmic rays from a few prominent sources 
may result in isotropic arrival directions as suggested in Refs. [Sl[ni[Z|- Cosmic rays from 
our own Galaxy core are not considered a viable explanation for the events approaching 
the GZK limit 8 . If our Galaxy substantially modifies propagation or contributes to 
production of the primaries above 4 x 10^^ eV, then any consequent anisotropy should be 
correlated with known source(s) of the Galaxy. On several bases, and in particular on the 
basis of symmetry, one asks whether the highest energy events are isotropic, up to the cuts 
imposed by observational limitations. 
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Several studies IIUI llll I12j at much lower energies have demonstrated a significant 
isotropy in arrival directions. The AG AS A group has presented an analysis of a sizable 
(4%) anisotropy of several hundred thousand events with energies above 10^^ eV. The 
study uses right ascension, declination {RA, DEC) coordinates ^0] and finds a statistically 
significant first Fourier moment in sm{RA). The definition of right ascension however, 
involves historical human conventions and a choice of axis based on the Earth's orbit. The 
coordinate system axis defining RA requires two parameters for specification, which are 
implicitly used in the analysis, potentially affecting the claimed statistical significance of 
effects. In addition, sm(j) is not a particular sensitive or informative statistic, potentially 
diminishing the impact of 216,000 events. 

Here we examine all available data from the AGASA experiment with energy above 
4 X 10^^ eV. We take care to define certain quantities of the analysis which previously 
have been used in a coordinate dependent way. Since the coordinate system is a human 
artifact, this is to be avoided, unless there is a preferred coordinate system based on other 
information. Our methods are taken from Refs. JS| and a large literature on invariant 
angular statistics. Recently Sommers |14| has advocated an approach using the same basic 
concepts. 

Statistical comparisons require formulation of a well defined null hypothesis. We take 
particular care on this point. The term "null" often indicates conditions of "no signal", 
which would be appropriate if one considers anisotropy a signal. More deeply, the scientific 
method proceeds by ruling out possibilities, rather than attempting to prove particular 
notions. The symmetries of data are the simplest and purest characterizations capable of 
being tested, or ruled out. Thus a solid null hypothesis is the foundation of any further 
claims. 

The elimination of coordinate dependent artifacts, the testing of a symmetry-based 
null, and the objectivity of maximum likelihood analysis create very powerful tools. If 
this method is valid, then it should be able to determine the systematic bias in real data 
objectively. The main purpose of the paper is to demonstrate the power of the tools for 
future use. However we also test our method to find significant evidence for anisotropy 
in the distribution of the highest energy events, which has been a question of intrinsic 
interest for many years. 

Methodology 

The method of maximum likelihood (see Refs. jl51 116j for example) allows one to test 
objectively between different statistical models. It is good practice to test models which are 
continuously related to the null when parameters are varied. Continuity in the likelihood 
test tends to assure that the same features of the data are tested by the null and competing 
models. 

Let X denote an element of a data set, for example the coordinates of an observed 
track. Given a normalized distribution f{x), the likelihood of the data in the distribution 
is defined to be the product of the distribution evaluated at each data point over the data 
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set. Maximum likelihood occurs at maximum log likelihood £, defined as 

N 

£ = J2log[f{x,)], (1) 

j 

where N is the number of data points. 

Hypothesis testing is done by comparing the difference of maximized log-likelihoods 
(T) for a given data set evaluated with two different distributions f{x), fnu\ii^)i where 
/nuii(x) is the nuh distribution. For large iV > 1, the statistic 2T = 2{C^^'^^^ - £™f ) has 
a chi-squared distribution for p parameters (Xp)- We will verify this and also determine 
the distribution independently by Monte Carlo. Note that it is not possible and not our 
objective to find the "exact" distribution behind the observed data. Instead, emphasis lies 
on testing the null distribution. It is sufficient to show that a trial distribution fits the 
data sufficiently well to rule out the null; one does not imply, or conclude from this that 
the trial distribution is the last word. 

Covariant and Invariant Statistics 

For covariant quantities we map the astronomical coordinates {DEC, RA) (7r/2 — 
0,(p) X, where x is a unit vector on the dome of the sky, 

X = (sin^cos(/), sin ^ sine/), cos 9). 

Naturally x transforms like a vector when the coordinate system is changed. The expan- 
sion in spherical harmonics are the same thing: vectors are the I = l,m representations of 
spherical harmonics. By contracting vector indices it is straightforward to make trial dis- 
tributions and statistically valid quantities that are scalars under rotations, and therefore 
independent of the coordinate system. 

The procedure is very natural, and so simple that we should expand on alternatives 
which do not have the same features. It is very common in the subject of "circular statis- 
tics" to see quantities such as <9>= J2f (^i/^^ = V <0'^> — <9>'^. Such quantities can 
be computed but they are so faulty as to be nearly meaningless. The fault is immediately 
seen by calculating <9> for an isotropic distribution on a circle. The distribution has no 
preferred orientation, yet the average angle <0>— > vr yields a preferred point, which is 
unacceptable. 

Analysis 

We apply our method to the set of 58 published tracks from the AGASA group with 
energies not less than 4 x 10^^ eV (Table 1, Refs. [IT]). The average angular resolution 
of the track directions is stated to be 1.8 degrees [Hj containing one sigma (68%) of the 
events. The tracks come from the region of polar angle 100° < 9 < 10°. This data set 
could be, in principle, anisotropic in any direction. 

The world's published data consists of some 200 points. However not all are available, 
while resolution, cuts and data quality varies, and the dangers of combining data from 
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different groups suggest tliat restricting the study to a liomogeneous set is tlie logical first 
step. We also want to make it clear that objective and highly significant conclusions can 
be extracted from carefully constructed tests without needing particularly large data sets. 
Our methods, of course, extend readily to much larger data sets anticipated from new 
arrays including HIRES and AUGER. 

The most basic model to analyze unimodal spherical data {x{6,(p)} is the 

Fisher distribution: 



where h{a, (3) is the direction of symmetry-breaking axis. This distribution has a long 
history as the spherical generalization of the venerated von-Mises distribution on the 
circle. Both distributions have the analytic properties of the Gaussian distribution, as 
seen by writing 

suppressing normalization factors. Parameter k is called the concentration parameter, and 
determines the extent that data is concentrated along the anisotropy axis h. 

The Fisher distribution has the minimal number of parameters possible on the sphere, 
and exhibits cylindrical symmetry about the axis h. The pre-factor ^1 — (n • x)'^ is ex- 
plained as follows. Choose axis direction n = (0, 0, 1) or along the north pole . This hides 
2 parameters, and reduces the distribution to the simpler form: 

/Fisher(0, (t>) = , ^ . sinSe'^-^^ (3) 
47r smh k 

The pre-factor sin 9 takes into account the Jacobian of solid angle on the sphere. 
Cylindrical symmetry about h is also obvious now, since (f) does not appear anywhere in 
(jS)). For the limit of small anisotropy k <C 1, the exponential can be expanded 

ft cos 1 a 

e ~ 1 — K COS y, 

which is one of the distributions discussed by Sommers ^] , and which needs the additional 
Jacobian factor to test anisotropy. 

The data set is subject to a cut in declination, (10° < < 100°). We modify the Fisher 
distribution to take this cut into account as: 

fcut ,Q ,^ ^ /Fisher (19, 0)g(6') 

/Fisherl'^,'?; //Fisher(^,0)<7(^)^i^d'/'' ^ ^ 

ff(e) = [7?(0-lO°)-r/(0-lOO°)], (5) 

where r/ is the Heaviside step function. The step functions have an invariant representation 
we do not bother to write here. Now (|1J) is the normalized Fisher distribution in the cut 
region and serves as a trial model of anisotropy. A variant of the Fisher distribution known 
as Watson distribution is useful in cases where the data is in either bipolar or girdle form 
and is given by 

Vl-(n-x)^e«("-^)^ 
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Figure 1: An isotropic distribution in 9 given in {Tj) and the detector response observed by 
AG AS A. While the isotropic distribution peaks a,t 6 = 90°, the detector response shows a peak at 
around 6* = 57°. 

which can be modified to take in to account the cuts in the data in the same way in 

We also need a null distribution on the cut region to test our model using the method 
of maximum likelihood. The null isotropic distribution on the sphere with the cut in angle 
6 is given by, 

sin 61 g{9) 



J"CUt 

/null 



(OA) 



(7) 



J sin 9 g 

The null distribution coincides with fpfsheri^^^) limit k — > 0. Thus when likelihood 

is evaluated, the variation of parameter k allows the analysis to be continuously connected 
with the null. However, the detectors usually do not have uniform acceptance in declina- 
tion. AGASA TT" gives a distribution in declination of all events above > 10^^ eV. We fit 
the distribution by a function of the form (a sin b9 + c) which is related to the isotropic 
distribution in 9, namely d{cos9). While the isotropic distribution peaks at 9 = 90°, the 
detector response drops as one move away from 9 = 57°. The distributions are plotted in 

Fig.m 

Results 

Evaluation of likelihood for the AGASA data set is straightforward. One finds a maximum 
log likelihood in the Fisher distribution to be 



units at 



-Cpisher = —128.10 

K = 1.35 ± 0.41; n = (4.22°, 88.34°). 
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In equatorial coordinates {RA, DEC), the axis of concentration lies along the direction 
(5'^ 53.36'", 85.78°). We have also evaluated the maximum log likelihood for Watson 
distribution © and the result is: £watson = —129.71 units. 

The null hypothesis is evaluated by running the same analysis with our fit function to 
the detector response in 9 shown in Fig. ^ The log likelihood is found to be 

-CnuU = —128.81 

units. The quantity 2T™'^^ = 2(£inodei — >Cnuii) is 1-42 units in case of Fisher distribu- 
tion (2T^^j^j;j,) and 1.81 units in case of Watson distribution C^T^^f^^^). However, if the 
detector response is not explicitly known or is not well-defined, one uses the isotropic 
null distribution 0. The log likelihood is £'nuii = —134.39 units for isotropic null case. 
Correspondingly 2Tp'^l^^ = 12.57 and 2r^^^„^ = 9.36. 

For large S> 1 it is known that 2T is distributed like Xpi^T) of p parameters. In 
practice 30-40 points is sufficient for large to apply, but the question also depends on 
the dimensionality of the data. We have 3 parameters in our case, so one should expect 
a Xs distribution. This is a prediction of pure statistical theory, without detail about the 
problem at hand, and something that may be questioned. 

Real data sets often contain correlations or irregularities of myriad possible origin. 
Such features may upset analytic estimates. We therefore made an independent Monte 
Carlo study of the statistic 2T based on the features of the problem. We took ten thousand 
random samples of 58 data points inside the cut region. Randomness was implemented 
by selecting points from a distribution flat in </>, and flat in cosO. For each sample of 58 
we varied k and h to determine the maximum likelihood, and we calculated the likelihood 
for K = 0; the result was a trial value of 2T. The distribution of 2T values for 10,000 runs 
is shown in Fig |21 along with the Xs distribution. Agreement is excellent. 

Statistical significance is determined by calculating P-values, also known as confidence 
levels, which are defined as the integrated probability of 2T to fluctuate in the null dis- 
tribution above the value determined for the data. A comparison of P-values is also a 
method to determine whether the distributions have any unexpected long tails. The P- 
value for the Xs distribution to give 2T > 1.42 or 2T > 1.81 is very large. It means the 
AGASA data set is rather isotropic and do not strongly point towards a unique source 
direction or do not suggest that the data points are concentrated in a girdle form. This 
is the conclusion from using detector response in declination reported by AGASA group. 
On the other hand, from isotropic null distribution, the P-value for the xi distribution to 
give 2T > 12.57 is 3.9 x 10~^, or 0.039%. In terms of 2-sided Gaussian statistics, which 
are often used for comparisons, P = 3.9 X lO^-^ corresponds to a "3.55 a" effect. 

Observations 

We plotted the data in several coordinate systems to visually examine signals of anisotropy, 
and to check the fit to the Fisher distribution. Results in Galactic Coordinates and 
Aitoff-Hammer equal-area projection are shown in Fig 13 The advantage of an equal-area 
projection is that an isotropic distribution will appear uniform, and anisotropy is not 
unduly distorted by the coordinates. Galactic coordinates are used so that any correlation 
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Figure 2: We plot here the 2T distribution from Monte Carlo (histogram) and chi-square distribu- 
tion of 3 parameters (solid curve). The Monte Carlo 2T distribution was found from ten thousand 
sets of 58 random tracks and taking twice the difference between maximum log likelihood of the 
Fisher and the isotropic null distributions. 

with the galaxy plane might be easily seen. The equatorial plane is shown by the solid 
curve. 

The figure shows the Fisher "hot-spot" extracted from the fit, namely the best-fit 
anisotropy axis n. As can be seen, the Fisher axis is close to the equatorial north pole and 
its error cone actually encloses the north pole. Also shown are the locations of some other 
features of obvious interest: the hot-spot of the lower energy AGASA analysis HIT], and the 
approximate location of the galaxy core. We added the direction of our system's motion 
relative to the cosmological background radiation, as deduced from the dipole component 
of the the 3° cosmic microwave background, for another comparison. 

Conclusion 

Unresolved questions regarding the origin of cosmic rays with energies above 4 x 10^^ eV 
suggests testing the degree of isotropy of the data. The method we have developed is 
capable of ruling out isotropy and eliminating any coordinate dependent artifacts. Given 
the detector bias reported by AGASA, the group's cosmic ray tracks above 4 x 10^^ eV 
are consistent with an isotropic distribution. The question of anisotropy probably hinges 
on a finer understanding of angular bias than currently available in published sources: for 
example, error bars on the angular acceptance would be useful. 

Acknowledgements: This study springs out of collaboration with Pankaj Jain, who 
we thank for productive comments and criticsm. Work was supported in part under 
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Figure 3: AitofF-Hammer equal area projection of the sky in Galactic Coordinates. The AGASA 
tracks are denoted by open circles with a radius equal to their angular resolution of 1.8 degrees. 
The anisotropic hot spot AGASA group found at lower energies JJ_, has been shown as the shaded 
region at the left corner. The Fisher axis of global anisotropic direction we have found is about 90° 
from the AGASA hot spot and close to the equatorial north pole. The error in the determination 
of the Fisher axis is shown by the circle We also show the direction of motion relative to 3° cosmic 
microwave background and the equatorial plane. 
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Table 1: AGASA Events Above 4 x 10^^ gy (J2000 coordinates) 
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